November 5th 2020

First things first - Intro

Welcome to this workshop on ANNs in R

Assumed Workshop Prerequisites

  • Comfortable using and navigating the RStudio IDE for R development
  • Experience with the Base and Tidyverse dialects of R
  • This workshop is introductory, so no prior knowledge of neural networks is required

General Workshop Info

  • Not all code details are important, so do not worry if not everything makes sense
  • All materials are and will remain open source under GPL-3.0 on GitHub, so you can revisit the entire workshop any time you would like to!
  • Since the materials remain freely available, please do not record this workshop or take screenshots

Workshop Attendee Profiles

If we denote the R experience levels as c(0, 1, 2, 3), then the average for this group is 1.8

Workshop Learning Objectives

A participant who has met the objectives of this workshop will be able to:

  • Conceptually describe
    • What an ANN is
    • How an ANN is trained
    • How predictions are made
    • What ANN hyperparameters are
  • Create a simple dense ANN model in R using TensorFlow via Keras
  • Apply the created model to predict on data
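
As a first taste of that objective, here is a minimal sketch of a dense ANN in R via keras. The layer sizes (4 input features, 4 hidden units, 3 output classes, matching the iris data used later) and the optimiser settings are illustrative choices, not the workshop's final model, and the code assumes the keras package and a TensorFlow backend are installed:

```r
# Minimal dense ANN in R via keras -- a sketch, not the workshop's final model.
# Assumes the keras package and a TensorFlow backend are installed.
# Layer sizes are illustrative: 4 input features, 4 hidden units, 3 classes.
library(keras)

model <- keras_model_sequential() %>%
  layer_dense(units = 4, activation = "relu", input_shape = 4) %>%
  layer_dense(units = 3, activation = "softmax")

model %>% compile(
  optimizer = "sgd",
  loss = "categorical_crossentropy",
  metrics = "accuracy"
)
```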

Workshop Limitations

  • The aim of this workshop is to introduce you to artificial neural networks in R

  • The key word here is introduce: we have limited time, so you will mainly be working with code I have prepared

  • We have 4 hours in total, so realistically we can only scratch the surface of deep learning

  • If you have little to no experience with base R and/or the Tidyverse, expect the workshop to feel overwhelming

  • Workshop materials will remain open, so my intention is that you can revisit and study them further after today's workshop

Your host for the day will be… Me!

  • Leon Eyrich Jessen

  • I’m from Copenhagen, Denmark

  • Background in biotech engineering and a PhD in Bioinformatics

  • I am an Assistant Professor of Bioinformatics at the Department of Health Technology, Technical University of Denmark

  • Head of the BioML group, focusing on the development and application of machine learning in bioinformatics

  • If you are interested in bioinformatics, machine learning and data science, feel free to find me on Twitter: @jessenleon

On Artificial Neural Networks (ANNs)

What are ANNs?

  • A mathematical framework inspired by the network structure of neurons in the human brain

Source: Bruce Blaus | Multipolar Neuron | CC BY 3.0

  • In reality, we do not really know how the human brain learns; we only know that it is capable of processing “data”

“Inspired by neuron network structure”

  • If you google “Artificial Neural Networks”, you will get something like this:

  • Let’s demystify this…

“Inspired by neuron network structure”

  • \(I_1 \dots I_n\): The input layer, \(B_I\): Bias/intercept
  • \(H_1 \dots H_m\): The hidden layer, \(B_H\): Bias/intercept
  • \(O\): The output

Making a Prediction: The feed forward algorithm

Example: Fully Connected Neural Network

  • To put it simply, the input vector I is transformed into a prediction O

  • The input vector is simply the set of variables in your data for a single observation, e.g.

## # A tibble: 10 x 5
##    Sepal.Length Sepal.Width Petal.Length Petal.Width Species   
##           <dbl>       <dbl>        <dbl>       <dbl> <fct>     
##  1          6.5         3            5.8         2.2 virginica 
##  2          5           3            1.6         0.2 setosa    
##  3          6.7         3.3          5.7         2.5 virginica 
##  4          4.8         3.4          1.9         0.2 setosa    
##  5          6.1         2.8          4           1.3 versicolor
##  6          6.8         3            5.5         2.1 virginica 
##  7          6.4         3.2          5.3         2.3 virginica 
##  8          5           3.3          1.4         0.2 setosa    
##  9          4.5         2.3          1.3         0.3 setosa    
## 10          7.9         3.8          6.4         2   virginica
  • \(Species \sim f(Sepal.Length, Sepal.Width, Petal.Length, Petal.Width)\)

  • We can visualise this like so

Example: Fully Connected Neural Network

  • I is the input vector, H is the hidden layer and O is the output
  • B is the bias neuron, think intercept in the familiar \(y = b + a \cdot x\)

Example: Fully Connected Neural Network

  • Flow from input layer (features) to hidden layer:

    \(H_{j} = I_{1} \cdot v_{1,j} + I_{2} \cdot v_{2,j} + \dots + I_{n} \cdot v_{n,j} + B_{I} \cdot v_{n+1,j} =\) \(\sum_{i=1}^{n} I_{i} \cdot v_{i,j} + B_{I} \cdot v_{n+1,j} = \sum_{i=1}^{n+1} I_{i} \cdot v_{i,j} = \textbf{I} \cdot \textbf{v}_j\)

  • Non-linear transformation of hidden layer input to hidden layer output (activation function):

    \(S(H_{j}) = \frac{1}{1+e^{-H_{j}}}\)

Example: Fully Connected Neural Network

  • Flow from hidden layer to output layer:

    \(O = H_{1} \cdot w_{1} + H_{2} \cdot w_{2} + \dots + H_{m} \cdot w_{m} + B_{H} \cdot w_{m+1} =\) \(\sum_{j=1}^{m} H_{j} \cdot w_{j} + B_{H} \cdot w_{m+1} = \sum_{j=1}^{m+1} H_{j} \cdot w_{j} = \textbf{H} \cdot \textbf{w}\)

  • Non-linear transformation of output layer input to output layer output (activation function):

    \(S(O) = \frac{1}{1+e^{-O}}\)
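
The two flow steps above can be sketched directly in base R. The weights below are random illustrative values (a network with 2 hidden neurons is assumed), and `sigmoid` is our own helper implementing \(S(x)\):

```r
# Feed-forward pass for one observation -- an illustrative sketch with
# random example weights v and w; sigmoid implements S(x) from above
set.seed(1)
sigmoid <- function(x) 1 / (1 + exp(-x))

I <- c(6.5, 3.0, 5.8, 2.2, 1)                  # 4 features plus bias neuron B_I = 1
v <- matrix(rnorm(5 * 2), nrow = 5, ncol = 2)  # weights: input layer -> 2 hidden neurons
H <- c(sigmoid(as.vector(I %*% v)), 1)         # hidden output plus bias neuron B_H = 1
w <- rnorm(3)                                  # weights: hidden layer -> output
O <- sigmoid(sum(H * w))                       # final prediction, a number in (0, 1)
```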

Training a Network: The Back Propagation Algorithm

Example: Fully Connected Neural Network

  • Activation function (non-linearity):
    • \(S(x) = \frac{1}{1+e^{-x}}\)
  • Loss function (error):
    • \(E = MSE(O, T) = \frac{1}{2} \left( O - T \right)^2\)
  • Optimisation using gradient descent (weight updates):
    • \(\Delta w = - \epsilon \frac{\partial E}{\partial w}\)
    • \(\Delta v = - \epsilon \frac{\partial E}{\partial v}\)
    • Where \(\epsilon\) = learning rate
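
As a sanity check of the update rule, the analytic gradient \(\frac{\partial E}{\partial w}\) for a single sigmoid output neuron can be compared against a numerical approximation. The values of h, t and w below are made up for illustration:

```r
# Check the gradient used in Delta w = -eps * dE/dw for a single output
# neuron o = S(w * h) with loss E = 1/2 * (o - t)^2 (made-up example values)
sigmoid <- function(x) 1 / (1 + exp(-x))
h <- 0.8; t <- 1; w <- 0.5

E <- function(w) 0.5 * (sigmoid(w * h) - t)^2
o <- sigmoid(w * h)
an_grad  <- (o - t) * o * (1 - o) * h             # chain rule: dE/dw
eps <- 1e-6
num_grad <- (E(w + eps) - E(w - eps)) / (2 * eps) # central difference
```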

Activation Function Examples

Activation Function - Sigmoid

  • Low input and the neuron is turned off (emits 0)
  • Medium input and the neuron emits a number between 0 and 1
  • High input and the neuron is turned on (emits 1)
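
In base R this is a one-liner (the helper name is our own):

```r
# Sigmoid: squashes any input into (0, 1)
sigmoid <- function(x) 1 / (1 + exp(-x))
sigmoid(c(-10, 0, 10))  # close to 0, exactly 0.5, close to 1
```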

Activation Function - Rectified Linear Unit

  • Input less than zero and the neuron is turned off (emits 0)
  • Input larger than zero and the neuron simply propagates the signal (emits x)
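
In base R the vectorised `pmax()` does exactly this (the helper name is our own):

```r
# ReLU: zero for negative input, identity otherwise
relu <- function(x) pmax(0, x)
relu(c(-2, 0, 3))  # 0 0 3
```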

Activation Function - Leaky Rectified Linear Unit

  • Input less than zero and the neuron is almost turned off (emits a small number)
  • Input larger than zero and the neuron simply propagates the signal (emits x)
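
A base R sketch; the slope alpha = 0.01 for negative inputs is a common but arbitrary choice:

```r
# Leaky ReLU: small slope alpha for negative input, identity otherwise
leaky_relu <- function(x, alpha = 0.01) ifelse(x < 0, alpha * x, x)
leaky_relu(c(-2, 3))  # -0.02 3.00
```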

Activation Function - Output neuron(s)

  • The choice of activation function for the output neuron(s) depends on the aim

    • Binary Classification: Sigmoid
    • Multiclass Classification: Softmax, softmax\((x_i) = \frac{e^{x_i}}{\sum_{j=1}^{n} e^{x_j}}\)
    • Regression: Linear
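
Softmax is also easy to sketch in base R; subtracting `max(x)` before exponentiating does not change the result but avoids numerical overflow (the helper name is our own):

```r
# Softmax: turns a vector of scores into probabilities summing to 1
softmax <- function(x) {
  z <- exp(x - max(x))  # shift by max(x) for numerical stability
  z / sum(z)
}
softmax(c(1, 2, 3))  # sums to 1; the largest score gets the largest probability
```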

Optimiser: Stochastic Gradient Descent

  • We need to find the values of our weights that result in the smallest possible cost
  • The optimisation cannot be solved analytically, so numerical approximations are used
  • E.g. SGD back-propagates the loss for one observation at a time, which allows for fluctuations
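
A minimal sketch of per-observation SGD for a single sigmoid neuron with one weight and a bias, using the update rule \(\Delta w = - \epsilon \frac{\partial E}{\partial w}\) from earlier. The toy data (one feature, a binary target) and the learning rate are made up for illustration:

```r
# Per-observation SGD for one sigmoid neuron (one weight + bias)
sigmoid <- function(x) 1 / (1 + exp(-x))
x <- c(5.8, 1.6, 5.7, 1.9, 4.0)  # one input feature per observation (toy data)
t <- c(1, 0, 1, 0, 1)            # binary targets
w <- 0; b <- 0; eps <- 0.1       # weights start at zero, learning rate eps

loss <- function() mean(0.5 * (sigmoid(w * x + b) - t)^2)
loss_before <- loss()

for (epoch in 1:200) {
  for (i in seq_along(x)) {              # update after every single observation
    o    <- sigmoid(w * x[i] + b)
    grad <- (o - t[i]) * o * (1 - o)     # dE/d(neuron input), via the chain rule
    w    <- w - eps * grad * x[i]        # Delta w = -eps * dE/dw
    b    <- b - eps * grad               # same rule for the bias weight
  }
}
loss_after <- loss()
```

The inner loop is what makes this stochastic: each weight update follows the gradient of a single observation's error, so the loss fluctuates on its way down.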

Optimiser: Stochastic Gradient Descent

Summary

Key Terms and Concepts

  • Input layer: The first layer of neurons being fed the examples from the feature matrix
  • Hidden layer(s): The layers connecting the visible input and output layers
  • Output layer: The layer creating the final output (prediction)
  • Feed forward algorithm: The algorithm used to make a prediction, where information flows from the input via the hidden to the output layer
  • Activation function: The function used to make a non-linear transformation of the set of linear combinations feeding into a neuron
  • Back propagation algorithm: The algorithm used for iteratively training the ANN
  • Loss/error function: The function used to measure the error between the true and the predicted value, when training the ANN
  • Optimiser: The function used for optimising the weights, when training the ANN
  • Epoch: One run through all training examples
  • An ANN can do both binary and multiclass classification and also regression

Time for exercises!

  • Please proceed to the exercise on prototyping an ANN

  • During the exercises, I strongly advise you to pair up two-and-two, so you can discuss

  • Internalising knowledge is much more efficient when you are forced to put concepts into words

  • GitHub repo for this workshop is: https://github.com/leonjessen/RPharma2020